AITopics | surface information

DS-ProGen: A Dual-Structure Deep Language Model for Functional Protein Design

Li, Yanting, Jiang, Jiyue, Wang, Zikang, Lin, Ziqian, He, Dongchen, Shan, Yuheng, Shao, Yanruisheng, Li, Jiayi, Shi, Xiangyu, Wang, Jiuming, Chen, Yanyu, Fan, Yimin, Li, Han, Li, Yu

arXiv.org Artificial Intelligence, May 20, 2025

Inverse Protein Folding (IPF) is a critical subtask in the field of protein design, aiming to engineer amino acid sequences capable of folding correctly into a specified three-dimensional (3D) conformation. Although substantial progress has been achieved in recent years, existing methods generally rely on either backbone coordinates or molecular surface features alone, which restricts their ability to fully capture the complex chemical and geometric constraints necessary for precise sequence prediction. To address this limitation, we present DS-ProGen, a dual-structure deep language model for functional protein design, which integrates both backbone geometry and surface-level representations. By incorporating backbone coordinates as well as surface chemical and geometric descriptors into a next-amino-acid prediction paradigm, DS-ProGen is able to generate functionally relevant and structurally stable sequences while satisfying both global and local conformational constraints. On the PRIDE dataset, DS-ProGen attains the current state-of-the-art recovery rate of 61.47%, demonstrating the synergistic advantage of multi-modal structural encoding in protein design. Furthermore, DS-ProGen excels in predicting interactions with a variety of biological partners, including ligands, ions, and RNA, confirming its robust functional retention capabilities.
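To make the reported metric concrete, here is a minimal sketch of how a sequence recovery rate is commonly computed in inverse folding: the fraction of positions at which the designed sequence matches the native one. The function name `recovery_rate` and the toy sequences are hypothetical, and the abstract does not specify DS-ProGen's exact evaluation protocol, so this is an assumption about the metric rather than the authors' code.

```python
def recovery_rate(native: str, designed: str) -> float:
    """Fraction of aligned positions where the designed residue equals the native one.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


# Toy example with hypothetical sequences (not taken from the paper).
native = "MKTAYIAKQR"
designed = "MKTAYLAKQK"
print(f"{recovery_rate(native, designed):.2%}")  # prints 80.00%
```

In this toy case 8 of 10 positions agree, giving 80%. A benchmark figure such as 61.47% would typically be an aggregate of this quantity over many test structures.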

large language model, machine learning, natural language, (19 more...)

arXiv:2505.12511

Country:

Asia > China > Hong Kong (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Singapore (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Paved or unpaved? A Deep Learning derived Road Surface Global Dataset from Mapillary Street-View Imagery

Randhawa, Sukanya, Aygun, Eren, Randhawa, Guntaj, Herfort, Benjamin, Lautenbach, Sven, Zipf, Alexander

arXiv.org Artificial Intelligence, Oct 29, 2024

We have released an open dataset with global coverage of road surface characteristics (paved or unpaved), derived from 105 million images from the world's largest crowdsourcing-based street view platform, Mapillary, using state-of-the-art geospatial AI methods. We propose a hybrid deep learning approach that combines SWIN-Transformer-based road surface prediction with CLIP-and-DL segmentation-based thresholding to filter out poor-quality images. The road surface predictions have been matched and integrated with OpenStreetMap (OSM) road geometries. This study provides global insights, derived from maps and statistics, into the spatial distribution of Mapillary coverage and road pavedness at continental and country scales, with a rural/urban distinction. The dataset expands the availability of global road surface information by over 3 million kilometers, now representing approximately 36% of the total length of the global road network. Most regions showed moderate to high paved road coverage (60-80%), but significant gaps were noted in specific areas of Africa and Asia. Urban areas tend to have near-complete paved coverage, while rural regions display more variability. Model validation against OSM surface data achieved strong performance, with F1 scores for paved roads between 91% and 97% across continents. Building on the work of Mapillary and its contributors and on the enrichment of OSM road attributes, our work provides valuable insights for applications in urban planning, disaster routing, and logistics optimisation, and addresses several Sustainable Development Goals (SDGs), especially SDGs 1 (No poverty), 3 (Good health and well-being), 8 (Decent work and economic growth), 9 (Industry, Innovation and Infrastructure), 11 (Sustainable cities and communities), 12 (Responsible consumption and production), and 13 (Climate action).
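For readers unfamiliar with the validation metric, the sketch below shows how an F1 score for the "paved" class can be computed from confusion counts, treating OSM surface tags as the reference labels. The function name `f1_score` and the counts are hypothetical illustrations and are not figures from the dataset.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for the positive ("paved") class.

    tp: predicted paved, reference paved
    fp: predicted paved, reference unpaved
    fn: predicted unpaved, reference paved
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one continent (illustration only).
print(f"{f1_score(tp=90, fp=10, fn=5):.3f}")  # prints 0.923
```

With these counts, precision is 0.900 and recall is about 0.947, so the F1 score is about 0.923, which would fall inside the 91% to 97% range quoted above.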

dataset, information, osm segment, (14 more...)

arXiv:2410.19874

Country:

North America > Haiti (0.14)
Europe > Germany > Baden-Württemberg (0.04)
Oceania > Australia (0.04)
(42 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.34)

Industry:

Energy (0.93)
Health & Medicine > Consumer Health (0.54)
Transportation > Ground > Road (0.49)
Transportation > Infrastructure & Services (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Knowledge of Pretrained Language Models on Surface Information of Tokens

Hiraoka, Tatsuya, Okazaki, Naoaki

arXiv.org Artificial Intelligence, Feb 22, 2024

Do pretrained language models have knowledge regarding the surface information of tokens? We examined the surface information stored in word or subword embeddings acquired by pretrained language models from the perspectives of token length, substrings, and token constitution. Additionally, we evaluated the ability of models to generate knowledge regarding token surfaces. We focused on 12 pretrained language models that were mainly trained on English and Japanese corpora. Experimental results demonstrate that pretrained language models have knowledge regarding token length and substrings but not token constitution. Additionally, the results imply that there is a bottleneck on the decoder side in terms of ...

Figure 1: Input and output examples when asking GPT-3.5 Turbo about the surface information of words (as of 1 Jan. 2024). The Japanese example has the same meaning as the English text, asking for the length of, and the third character in, 人類学者 (anthropologist).
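The surface question in the Figure 1 caption (the length and third character of 人類学者) has a ground truth that is trivial to compute from the string itself, which is what makes it a useful probe. The helper below is a hypothetical illustration, not code from the paper; it assumes that "length" means the number of Unicode code points, which is what Python's `len` returns for `str`.

```python
def surface_info(word: str, position: int) -> dict:
    """Return the length of `word` and its character at the 1-based `position`.

    Length is counted in Unicode code points, not tokenizer tokens or bytes.
    """
    if not 1 <= position <= len(word):
        raise ValueError("position out of range")
    return {"length": len(word), "char": word[position - 1]}


print(surface_info("人類学者", 3))        # {'length': 4, 'char': '学'}
print(surface_info("anthropologist", 3))  # {'length': 14, 'char': 't'}
```

A language model, by contrast, sees such a word as one or more subword tokens, so it has to recover these properties from its embeddings rather than from the characters directly.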

information, knowledge, surface information, (14 more...)

arXiv:2402.09808

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Learning to Teach with Deep Interactions

Fan, Yang, Xia, Yingce, Wu, Lijun, Xie, Shufang, Liu, Weiqing, Bian, Jiang, Qin, Tao, Li, Xiang-Yang, Liu, Tie-Yan

arXiv.org Machine Learning, Jul 9, 2020

Machine teaching [42, 37] uses a meta/teacher model to guide the training of a student model (which will be used in real tasks) through training data selection, loss function design, etc. Previous teacher models take only shallow/surface information as inputs (e.g., training iteration number, loss and accuracy from training/validation sets) while ignoring the internal states of the student model, which limits the potential of learning to teach. In this work, we propose an improved data teaching algorithm in which the teacher model deeply interacts with the student model by accessing its internal states. The teacher model is jointly trained with the student model using meta gradients propagated from a validation set. We conduct experiments on image classification with clean/noisy labels and empirically demonstrate that our algorithm significantly improves on previous data teaching methods.

artificial intelligence, machine learning, student model, (17 more...)

arXiv:2007.04649

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Russia (0.04)
Europe > France (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
